Using ensemble SVM to identify human GPCRs N-linked glycosylation sites based on the general form of Chou's PseAAC

Protein Eng Des Sel. 2013 Nov;26(11):735-42. doi: 10.1093/protein/gzt042. Epub 2013 Sep 18.

Abstract

G-protein coupled receptors (GPCRs), the most frequent drug targets, are a large family of seven-transmembrane receptors that sense molecules outside the cell and activate intracellular signal transduction pathways. Glycosylation is one of the most complex post-translational modifications (PTMs) of proteins in eukaryotic cells. It plays important roles in a variety of cellular functions, including protein folding, protein trafficking and localization, cell-cell interactions and epitope recognition. Therefore, determining the exact positions of glycosylation sites in GPCR sequences can provide useful clues for drug design and other biotechnology applications. Experimental identification of glycosylation sites is expensive and laborious. Hence, there is significant interest in developing computational methods that reliably predict glycosylation sites from amino acid sequences. In this article, we present an effective method for recognizing the N-linked glycosylation sites of human GPCRs by combining amino acid hydrophobicity with an ensemble support vector machine. The prediction accuracy, sensitivity, specificity, Matthews correlation coefficient and area under the curve were 94.4%, 89.7%, 98.9%, 0.895 and 0.989, respectively. Such a fast and accurate prediction method will speed up the identification of functional sites in GPCRs and thereby facilitate drug discovery.
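For readers unfamiliar with the evaluation measures reported above, the following minimal Python sketch shows how accuracy, sensitivity, specificity and the Matthews correlation coefficient are conventionally computed from the counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper. The area under the curve is not covered here because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification measures from confusion counts.

    tp/fn: glycosylation sites predicted correctly / missed.
    tn/fp: non-sites predicted correctly / wrongly flagged as sites.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a class is empty.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fn + tn + fp)
    sensitivity = ratio(tp, tp + fn)   # true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = ratio(tp * tn - fp * fn, mcc_denominator)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not the paper's data).
metrics = binary_metrics(tp=90, fn=10, tn=95, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four counts symmetrically, which is why it is often reported alongside sensitivity and specificity when positive and negative sites are unevenly represented.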

Keywords: N-linked glycosylation; amino acid hydrophobicity; ensemble support vector machine; human G-protein coupled receptors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids / chemistry
  • Computational Biology
  • Databases, Protein
  • Glycosylation
  • Humans
  • Hydrophobic and Hydrophilic Interactions
  • ROC Curve
  • Receptors, G-Protein-Coupled / chemistry*
  • Sequence Analysis, Protein / methods*
  • Support Vector Machine*

Substances

  • Amino Acids
  • Receptors, G-Protein-Coupled